Machine Learning-based systems for the automation of systematic literature reviews in food safety domain

Project facts

Project promoter:
Warsaw University of Technology(PL)
Project Number:
PL-Applied Research-0019
Status:
Completed
Final project cost:
€1,242,947
Donor Project Partners:
Norwegian Institute of Public Health(NO)
Oslo Metropolitan University(NO)
Other Project Partners:
National Institute of Public Health - National Institute of Hygiene(PL)
Tecna Sp. z o. o.(PL)
Programme:

Description

The main goal of the project is to provide a tool that helps authorities and researchers process a large volume of literature in a reasonable time. To achieve that goal, the tool must be based on ontologies and artificial intelligence techniques such as natural language processing and various types of classifiers. To formulate sound policies, the relevant institutions need to monitor the current state of knowledge in their field of interest. When knowledge is extracted manually, a great deal of time and effort goes into searching for potentially relevant literature, filtering out the valuable items, and finally extracting useful knowledge from the chosen sources. The proposed tool will save time and increase the efficiency of the literature review at each of these steps. That in turn will enable authorities and researchers to make more accurate decisions in a shorter time. NIPH-NSCFE (donor partner) and NIPH-NIH (partner) will benefit directly by obtaining tools that facilitate the systematic literature reviews they perform routinely. The other partners will enhance their expertise in the field of artificial intelligence by providing advanced tools.

Summary of project results

The project was devoted to developing machine learning-based tools for automating systematic literature reviews in the food safety domain. The aim was to create an application that uses the semantic information contained in articles to classify them as relevant or irrelevant to the topic that initiates a systematic literature review. The outcome of the project is a system, based on open source software, that supports systematic literature reviews in the food safety domain. The tool is accessible over the Internet and does not require its users to have a deep understanding of the artificial intelligence methods it applies.

The aim of the REFSA project was to provide tools for supporting systematic literature reviews in the food safety domain. The work carried out in the project concentrated on building a web-based application for automating systematic literature reviews (SLRs). The application has graphical user interfaces for: initializing the SLR process by entering the initial query sentence; formulating bibliographic database queries according to the PICO structure; entering reviewers' labels for manually screened articles; and presenting the results of the SLR as pools of relevant and irrelevant articles. The application's processing pipeline consists of the following steps: formulating queries for retrieving articles from bibliographic databases; retrieving the articles; preprocessing the articles so that they can be represented as numerical vectors; annotating the articles; building a classification model; and classifying the articles as relevant or irrelevant with respect to the initial query. The implemented pipeline follows an active learning approach, which means that users of the system are required to classify some articles manually during the classification process. In addition to building the web-based application, research was carried out on the accuracy of classification models based on different text embeddings, including embeddings derived from ontology-based annotations of articles. The result of the project is a unique system for supporting systematic literature reviews, equipped with traditional text analysis and data classification tools as well as software for ontology-based semantic analysis of texts. Moreover, the system implements two approaches side by side: one uses active learning in the classification process; the other builds a ranking list based on a citation network.
The system has a modular structure: new modules can be added in the future, and existing ones can be replaced if progress in machine learning tools makes the system more effective. In particular, the functionality of the system could be extended with a text analysis module that uses large language models. The system is dedicated to systematic literature reviews in the food safety domain, but it can be used in other fields provided that ontologies for those fields exist.
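The active learning step described in this summary can be illustrated with a minimal sketch. It is not the REFSA implementation: the naive Bayes classifier on raw word counts, the uncertainty-sampling rule, the function names, the toy article titles, and the keyword-based `reviewer` stand-in for a human screener are all assumptions made for illustration only. The real system works on text embeddings and ontology-based annotations.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labelled):
    """Fit a tiny multinomial naive Bayes model on (text, is_relevant) pairs."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, label in labelled:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def prob_relevant(model, text):
    """Estimated probability that an article is relevant (Laplace smoothing)."""
    counts, docs, vocab = model
    total_docs = docs[True] + docs[False]
    log_score = {}
    for label in (True, False):
        total_tokens = sum(counts[label].values())
        score = math.log((docs[label] + 1) / (total_docs + 2))
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log(
                    (counts[label][tok] + 1) / (total_tokens + len(vocab))
                )
        log_score[label] = score
    top = max(log_score.values())
    e_true = math.exp(log_score[True] - top)
    e_false = math.exp(log_score[False] - top)
    return e_true / (e_true + e_false)


def active_learning(pool, seed, oracle, rounds):
    """Repeatedly ask the reviewer (oracle) about the most uncertain article."""
    labelled = list(seed)
    unlabelled = list(pool)
    for _ in range(rounds):
        if not unlabelled:
            break
        model = train(labelled)
        # Uncertainty sampling: pick the article whose score is closest to 0.5.
        query = min(unlabelled, key=lambda t: abs(prob_relevant(model, t) - 0.5))
        unlabelled.remove(query)
        labelled.append((query, oracle(query)))
    model = train(labelled)
    relevant = [t for t in unlabelled if prob_relevant(model, t) >= 0.5]
    irrelevant = [t for t in unlabelled if prob_relevant(model, t) < 0.5]
    return labelled, relevant, irrelevant


# Toy data (hypothetical titles, not taken from the project).
seed = [
    ("salmonella contamination in poultry meat", True),
    ("football match results and league table", False),
]
pool = [
    "listeria contamination in cheese",
    "league football transfer news",
    "pesticide residues in poultry feed",
    "stock market results today",
]


def reviewer(title):
    """Stand-in for a human reviewer's manual screening decision."""
    return any(w in title for w in ("contamination", "residues"))


labelled, relevant, irrelevant = active_learning(pool, seed, reviewer, rounds=2)
print(len(labelled), "labelled;", len(relevant), "relevant;", len(irrelevant), "irrelevant")
```

Each round spends the reviewer's effort on the article the current model is least sure about, which is the usual motivation for active learning in screening: fewer manual labels are needed before the remaining pool can be split into relevant and irrelevant articles.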

The result of the REFSA project is a modular system, based on machine learning tools, for supporting systematic literature reviews. Systematic literature reviews are used in many fields, primarily in medicine and food safety. The application is designed so that it can be used in areas other than the one for which it was built. There are plans to expand the application with new modules: its own bibliographic database search engine and functionality based on large language models. There are also plans to use the application in the work of the National Institute of Public Health. Additionally, the development of a cybersecurity support system is being considered, including a module that analyzes content posted on websites and classifies documents in order to identify information requiring attention for national security reasons. This module would use elements of the REFSA system equipped with a cybersecurity ontology.

Summary of bilateral results

The project consortium consisted of teams with complementary competences. This applies to both the Polish and Norwegian parts of the consortium. NSCFE is Norway's leading institution for food safety and regularly conducts systematic reviews of the literature in this field. Oslo Metropolitan University is a leading Norwegian institution conducting advanced research in artificial intelligence, in particular on graph neural networks for data analysis. The competences of the Norwegian partners matched the needs of the project very well. The aim of the project was to develop an application supporting the work of experts who perform systematic literature reviews in the field of food safety. In building such an application, it was important to draw on the experience of people who perform such reviews, both to develop an appropriate user interface and to gain in-depth knowledge of the systematic literature review process. Collaborating with NSCFE made it possible to achieve both of these goals. The partnership established between the Polish and Norwegian parts of the consortium provides an excellent basis for future cooperation, built on improved knowledge, mutual understanding, increased visibility, etc. Continued cooperation is planned. The REFSA team intends to apply for funding to upgrade the REFSA system with a module based on LLMs. The bilateral collaboration has good potential to be extended to the regional and/or European level (towards the EU and its institutions). The institutions in the REFSA consortium plan to seek financing from Horizon Europe programmes to continue the research carried out under the REFSA project.

Information on the projects funded by the EEA and Norway Grants is provided by the Programme and Fund Operators in the Beneficiary States, who are responsible for the completeness and accuracy of this information.